Abstract

Let $A$ be a random subset of $\mathbb{Z}_N$ obtained by including each element of
$\mathbb{Z}_N$ in $A$ independently with probability $p$. We say that $A$ is linear if the
only Freiman homomorphisms are given by the restrictions of functions of
the form $f(x) = ax + b$. For which values of $p$ do we have that $A$ is linear
with high probability as $N \to \infty$?

First, we establish a geometric characterisation of linear subsets. Second,
we show that if $p = o(N^{-2/3})$ then $A$ is not linear with high probability,
whereas if $p = N^{-1/2+\epsilon}$ for any $\epsilon > 0$ then $A$ is linear with high
probability.

1 Introduction 

Freiman's structure theory of set addition is now one of the most general
and powerful tools in additive combinatorial number theory. The essential
concept of this theory is what is now known in the literature as a Freiman
homomorphism.

Let $A \subseteq \mathbb{Z}_N$ and let $\phi : A \to \mathbb{Z}_N$ be some function.

Definition 1.1. We say that $\phi$ is a Freiman homomorphism if whenever a
quadruple $a, b, c, d \in A$ satisfies $a - b = c - d$ then $\phi(a) - \phi(b) = \phi(c) - \phi(d)$.

Clearly if $f : \mathbb{Z}_N \to \mathbb{Z}_N$ is of the form $f(x) = ax + b$, that is the translate of
a group homomorphism, then the restriction $f|_A$ is a Freiman homomorphism.
We will refer to functions of the above form as linear.
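Definition 1.1 can be checked by brute force on small examples. The following is a minimal sketch, not part of the original argument; the helper name `is_freiman_hom` and the sample set are illustrative assumptions.

```python
from itertools import product

def is_freiman_hom(A, f, N):
    """Brute-force check of Definition 1.1 for a map f (given as a dict) on A in Z_N."""
    A = list(A)
    for a, b, c, d in product(A, repeat=4):
        # a - b = c - d  must force  f(a) - f(b) = f(c) - f(d)  (all mod N)
        if (a - b - c + d) % N == 0 and (f[a] - f[b] - f[c] + f[d]) % N != 0:
            return False
    return True

N = 13
A = [0, 1, 3, 4, 9]                         # hypothetical sample set
linear = {x: (5 * x + 2) % N for x in A}    # restriction of f(x) = 5x + 2
perturbed = dict(linear)
perturbed[1] = (linear[1] + 1) % N          # breaks 1 - 0 = 4 - 3

print(is_freiman_hom(A, linear, N), is_freiman_hom(A, perturbed, N))
```

The check is quartic in $|A|$, so it is only meant for experimenting with very small sets.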

For the sake of simplicity assume now that $N$ is a prime. Then it is easy
to see that the space of Freiman homomorphisms from $A$ to $\mathbb{Z}_N$, denoted
$\mathrm{Hom}_F(A, \mathbb{Z}_N)$, is a vector space over the field $\mathbb{F} = \mathbb{Z}_N$. We consider the notion
of Freiman rank or Freiman dimension of $A$:

$$\mathrm{rank}(A) = \dim \mathrm{Hom}_F(A, \mathbb{Z}_N) - 1$$

Observe that $1 \le \mathrm{rank}(A) \le |A| - 1$. The intuition here is that the size of
the rank should give some form of measure of the additive structure of $A$. For
example if $A$ is an arithmetic progression, i.e. $A = \{a_0 + i \cdot d : 0 \le i \le l - 1\}$,
then the only Freiman homomorphisms are given by the restrictions of linear
functions to $A$ and hence $\mathrm{rank}(A) = 1$. If $A \subseteq \mathbb{Z}_N$ has Freiman rank 1 we say
that the set $A$ is linear.

On the other side of the additive spectrum, we could pick $A$ to be a Sidon set,
that is a set where the only quadruples $(a, b, c, d) \in A^4$ such that $a + b = c + d$ are
the trivial ones. The classical example of a Sidon set is the set $\{1, 2, 4, \dots, 2^k :
k < \log_2(N/2)\}$. In this case the restrictions for a function to be a Freiman
homomorphism are essentially empty, so in fact any function $\phi : A \to \mathbb{Z}_N$ is a
Freiman homomorphism and therefore the Freiman rank is as large as it can be,
namely $\mathrm{rank}(A) = |A| - 1$.
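For prime $N$ the Freiman rank is a linear-algebra quantity, so the two extreme examples above can be verified numerically. The sketch below is an illustration under stated assumptions: the helper names `rank_mod_p` and `freiman_rank` are hypothetical, and each additive quadruple is simply turned into one linear constraint on the values of $f$.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank of an integer matrix over the field Z_p (p prime), by Gaussian elimination."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(u - factor * v) % p for u, v in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def freiman_rank(A, N):
    """rank(A) = dim Hom_F(A, Z_N) - 1 for prime N."""
    A = sorted(set(x % N for x in A))
    idx = {a: i for i, a in enumerate(A)}
    rows = []
    for a, b, c, d in product(A, repeat=4):
        if (a - b - c + d) % N == 0:
            # constraint f(a) - f(b) - f(c) + f(d) = 0
            row = [0] * len(A)
            row[idx[a]] += 1
            row[idx[b]] -= 1
            row[idx[c]] -= 1
            row[idx[d]] += 1
            rows.append(row)
    return len(A) - rank_mod_p(rows, N) - 1

print(freiman_rank([2, 5, 8, 11, 14, 17], 31))  # arithmetic progression
print(freiman_rank([1, 2, 4, 8], 31))           # powers of two (Sidon)
```

The arithmetic progression should come out with rank 1 and the Sidon set with rank $|A| - 1 = 3$, matching the two ends of the spectrum described above.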

It is possible to extend the definition of Freiman rank for N not prime or 
indeed any abelian group. However we will be only considering sets of Freiman 
rank 1 for which we can give a simple independent definition. 

Definition 1.2. We say that a set $A \subseteq \mathbb{Z}_N$ is linear if and only if the only
Freiman homomorphisms are given by the restrictions of linear functions.

We will now state the main result of this paper: 

Theorem 1.1. For any $\epsilon > 0$, let $A$ be a random subset of $\mathbb{Z}_N$ where each
$x \in \mathbb{Z}_N$ is chosen independently with probability $p = N^{-1/2+\epsilon}$. Then with high
probability $A$ is a linear set. Furthermore if $p = o(N^{-2/3})$ then with high
probability we may find non-trivial Freiman homomorphisms on $A$.

We take the opportunity to give a quick proof of the lower bound in Theorem
1.1. We claim the following holds:

Claim. Let $A \subseteq \mathbb{Z}_N$ with $|A| > 3$ and suppose there exists some $x_0 \in A$ such
that no $(x, y, z)$ in $A^3$ satisfies $x + y = z + x_0$. Then we may construct a
non-trivial Freiman homomorphism $f : A \to \mathbb{Z}_N$.

Proof. Set $B = A - x_0$. Since $B$ is simply a translate of $A$ we will be done if we
can construct a non-trivial Freiman homomorphism on $B$. Note that $0 \in B$ and
there are no triples $(x, y, z)$ in $B$ such that $(x + x_0) + (y + x_0) = (z + x_0) + x_0$,
which is to say $x + y = z$. In particular the only additive quadruples involving
the element $0 \in B$ are the trivial ones, that is of the form $0 + x = x + 0$.

Now define $f : B \to \mathbb{Z}_N$ as

Provided $|B| > 3$ the function $f$ is not the restriction of a linear function and
furthermore it is easy to see that it indeed defines a Freiman homomorphism on
$B$. □



Thus it is sufficient to show that for $p = o(N^{-2/3})$ we may find such $x_0 \in A$
with high probability. Let $X$ be the random variable given by the number of
additive quadruples in $A^4$. Clearly the number of $x \in A$ that are not possible
candidates for the above $x_0$ is at most $X$. By Markov's inequality we have that:

$$\mathbb{P}\left(X \ge \tfrac{1}{2}Np\right) \le \frac{2}{Np}\,\mathbb{E}(X) \le 2N^2p^3 = o(1) \qquad (1.1)$$

since $\mathbb{E}(X) \le N^3p^4$. On the other hand the number of elements in $A$ is given
by a $\mathrm{Bin}(N, p)$ binomial random variable, which we know to be strongly
concentrated around its mean provided that $p = \omega(1/N)$; in particular

$$\mathbb{P}\left(|A| < \tfrac{1}{2}Np\right) = o(1),$$

therefore with high probability $|A| > X$ and it follows that there must exist
$x_0 \in A$ with no triple $(x, y, z)$ in $A^3$ satisfying $x + y = z + x_0$.

2 Setting and basic observations 

Let $A \subset \mathbb{Z}_N$ and $f : A \to \mathbb{Z}_N$ be a Freiman homomorphism. For the sake of
simplicity, we will assume throughout the following sections that $A - A = \mathbb{Z}_N$
and that $0 \in A$. These assumptions become immaterial when we return to the
probabilistic setting since, provided $p = \omega(N^{-1/2})$, then $A - A = \mathbb{Z}_N$ with high
probability, and $A$ is linear if and only if any translate of $A$ is also linear.

Definition 2.1. The induced function of $f$, $\phi_f : \mathbb{Z}_N \to \mathbb{Z}_N$, is given by:

$$\phi_f(d) = f(x + d) - f(x) \quad \text{where } x, x + d \in A$$

Note that $\phi_f$ is well defined since $f(x + d) - f(x) = f(y + d) - f(y)$ for any
$x, y \in A$ with $x + d, y + d \in A$, as $f$ is a Freiman homomorphism. We will refer
to the induced function simply as $\phi$ unless further clarification is required.
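The induced function is easy to tabulate for a concrete set. The sketch below is illustrative only; `induced_function` is a hypothetical helper, and the example uses the set $\{0, 1, 3\} \subset \mathbb{Z}_7$, whose difference set is all of $\mathbb{Z}_7$.

```python
def induced_function(A, f, N):
    """Return phi as a dict d -> f(x + d) - f(x) (mod N), for every d in A - A.

    Raises ValueError if two representations of the same difference disagree,
    i.e. if f is not a Freiman homomorphism on A.
    """
    A = set(a % N for a in A)
    phi = {}
    for x in A:
        for y in A:
            d = (y - x) % N
            value = (f[y] - f[x]) % N
            if phi.setdefault(d, value) != value:
                raise ValueError(f"f is not a Freiman homomorphism (difference {d})")
    return phi

N = 7
A = [0, 1, 3]
f = {x: (3 * x + 1) % N for x in A}     # restriction of the linear map 3x + 1
phi = induced_function(A, f, N)
print(sorted(phi.items()))
```

For a linear $f(x) = ax + b$ the table is $\phi(d) = ad$, which is a group homomorphism, in line with Proposition 2.1 below.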

Here is the key property of the induced function: 

Proposition 2.1. A Freiman homomorphism $f$ on $A$ is linear if and only if
the induced function $\phi$ is a group homomorphism.

Proof. Suppose $\phi$ is a group homomorphism, so that $\phi(d + d') = \phi(d) + \phi(d')$
for all $d, d'$. Then $f$ is linear, as for any $x \in A$

$$f(x) - f(0) = \phi(x) = \phi(x - 1) + \phi(1) = \dots = x\phi(1)$$

The converse is clearly true. □

We are interested in understanding which structural properties of $A$ would
guarantee that $\phi$ is a linear function. A first simple observation is the following:

Observation 2.1. Whenever $A$ contains a triple of the form $x, x + d, x + d + d'$
it follows that $\phi(d + d') = \phi(d) + \phi(d')$ since

$$f(x + d + d') - f(x) = \big(f(x + d + d') - f(x + d)\big) + \big(f(x + d) - f(x)\big) = \phi(d') + \phi(d)$$

We will say, for convenience, that such a pair $(d, d')$ is additive.
We also have a sort of converse for this observation:
We also have a sort of converse for this observation: 

Proposition 2.2. Suppose that $\psi : \mathbb{Z}_N \to \mathbb{Z}_N$ is a function such that

$$\psi(d_1 + d_2) = \psi(d_1) + \psi(d_2)$$

whenever the pair $(d_1, d_2)$ is additive. Then there exists a Freiman homomorphism
$f : A \to \mathbb{Z}_N$ such that $\phi_f = \psi$.

Proof. The set of additive pairs is invariant under translations and so are Freiman
homomorphisms, so we may assume without loss of generality that $0 \in A$.

Set $f : A \to \mathbb{Z}_N$ to be $\psi|_A$. Firstly we need to check that $\psi$ preserves
additive quadruples and hence is a Freiman homomorphism: note that each
quadruple may be expressed as $(x, x + d_1, x + d_2, x + d_1 + d_2) \in A^4$. Now we
exploit the fact that $\psi$ is additive on all additive pairs:

$\psi(x + d_1) = \psi(x) + \psi(d_1)$ as $(0, x, x + d_1) \in A^3$
$\psi(x + d_1 + d_2) = \psi(x) + \psi(d_1 + d_2)$ as $(0, x, x + d_1 + d_2) \in A^3$
$\psi(d_1 + d_2) = \psi(d_1) + \psi(d_2)$ as $(x, x + d_1, x + d_1 + d_2) \in A^3$.

Therefore $\psi(x + d_1 + d_2) + \psi(x) = \psi(x) + \psi(d_1) + \psi(d_2) + \psi(x) = \psi(x + d_1) +
\psi(x + d_2)$ as required. Now recall that $\phi_f(d) = f(x + d) - f(x)$ where $x, x + d \in A$
and so $\phi_f(d) = \psi(x + d) - \psi(x) = \psi(d)$ since the triple $(0, x, x + d) \in A^3$. □

Hence the simplest condition one could ask for to force $\phi$ to be a homomorphism
is that for each pair of differences $(d, d') \in \mathbb{Z}_N^2$ we can find a triple of the form
$(x, x + d, x + d + d') \in A^3$, or in other words that every pair $(d, d')$ is additive.
However, if we have the probabilistic setting in mind, the probability of finding
such a triple for $d, d' \ne 0$ is at most $Np^3$ by the trivial union bound and thus
for $p \ll N^{-1/3}$ this event will occur with small probability. If we are to have
any hope of showing Theorem 1.1 we ought to look at things in greater detail.

The second observation is that in fact we need not ask for all pairs $(d, d')$
to be additive in order to conclude that $\phi$ is a homomorphism. For instance, if
we are already given that the pairs $(3, 1)$, $(2, 1)$ and $(1, 1)$ are additive we could
deduce that $\phi(2 + 2) = \phi(3 + 1) = \phi(3) + \phi(1) = \phi(2) + \phi(1) + \phi(1) = \phi(2) + \phi(2)$,
which is to say the pair $(2, 2)$ is additive.

This suggests that we look at how much information we can extract from
additive pairs. More explicitly, we would like to answer the following question:
given a fixed set of additive pairs, which equalities other than the trivial ones
may we deduce from it? It turns out that this problem can be interpreted
as determining whether a given word is the trivial element in a certain group
presentation; it is not surprising that in order to tackle it, it will be helpful to
introduce a notion very much analogous to that of a Cayley complex.

3 A topological space 

Definition 3.1. Let $C_A$ be the 2-dimensional cell-complex constructed from $A$
as follows:

• The vertices of $C_A$ are simply the elements of $\mathbb{Z}_N$.

• The 1-cells are given by all directed edges $x \to y$ for $x \ne y$
and are labeled by $\bar{d}$ where $d = y - x$.

• Whenever three edges with labels $\bar{d}_1, \bar{d}_2, \bar{d}_3$ form an oriented triangle $T$ in $C_A$
(which implies that $d_1 + d_2 + d_3 = 0$) and there exists a triple of the form
$(x, x + d_1, x + d_1 + d_2) \in A^3$ we add a 2-cell with $T$ as its boundary and we
label it $[d_1, d_2, d_3]$.

Remark 1. The 1-skeleton of $C_A$ is the complete Cayley graph on $\mathbb{Z}_N$. The use
of $\bar{d}$ to label the edges may seem strange to the reader; the reason behind it is
that we will be considering the chain group given by the edge labels and we wish
to make clear the distinction between $d$, an element of $\mathbb{Z}_N$, and $\bar{d}$, an element
of the chain group.

Remark 2. Usually when dealing with simplicial complexes one denotes a spe- 
cific simplex by its vertex set but since our definition of a 2-cell is invariant 
under any translate it is more appropriate to use the labels of edges bounding 
it. Also we will make no distinction between a specific 2-cell and its correspond- 
ing label since all the properties we are interested in are translation invariant 
and two distinct 2-cells carry the same label if and only if there is a translation 
mapping the vertices of one to the other. 
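Since 2-cells are identified with their labels, the set of labels of $C_A$ can be listed directly from $A$. The following sketch is an illustration; the function name `two_cell_labels` is hypothetical.

```python
def two_cell_labels(A, N):
    """Labels [d1, d2, d3] of the 2-cells of C_A.

    A label is present when d1 + d2 + d3 = 0 (mod N) and some triple
    x, x + d1, x + d1 + d2 of distinct vertices lies entirely in A.
    """
    A = set(a % N for a in A)
    labels = set()
    for x in A:
        for y in A:
            for z in A:
                if len({x, y, z}) == 3:          # edges join distinct vertices
                    labels.add(((y - x) % N, (z - y) % N, (x - z) % N))
    return labels

labels = two_cell_labels([0, 1, 3], 7)
print(sorted(labels))
```

Each ordered triple of distinct elements of $A$ contributes one label, so translates of the same triangle collapse to a single entry.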

The crucial aspect of the construction is the following: 

Proposition 3.1. Whenever a 2-cell $[d_1, d_2, d_3]$ is in $C_A$ then any induced
function $\phi$ must satisfy

$$\phi(d_1) + \phi(d_2) + \phi(d_3) = 0 \qquad (\star)$$

Proof. Since $[d_1, d_2, d_3]$ is a 2-cell in $C_A$ we know that $d_1 + d_2 + d_3 = 0$, which
is to say $-d_3 = d_1 + d_2$, and that there exists a triple $(x, x + d_1, x + d_1 + d_2)$ in
$A^3$.

By Observation 2.1 we can conclude that any induced function must satisfy
$\phi(d_1 + d_2) = \phi(d_1) + \phi(d_2)$; furthermore, any induced function must also satisfy
$\phi(-d) = -\phi(d)$ and hence

$$\phi(d_3) + \phi(d_2) + \phi(d_1) = \phi(-(d_1 + d_2)) + \phi(d_2) + \phi(d_1) = -(\phi(d_1) + \phi(d_2)) + \phi(d_2) + \phi(d_1) = 0$$

□



We will say that $\alpha = (d_1, d_2, \dots, d_l)$ is a cycle in $C_A$ whenever $\partial(\alpha) = 0$, where
$\partial$ denotes the boundary function. Note that this coincides with the
graph-theoretic notion of cycle.

Proposition 3.2. Let $\alpha = (d_1, d_2, \dots, d_l)$ be a cycle in $C_A$. If $\alpha$ belongs to the
trivial homology class then any induced function $\phi$ must satisfy

$$\phi(d_1) + \phi(d_2) + \dots + \phi(d_l) = 0 \qquad (3.1)$$

Proof. A cycle is in the trivial homology class if and only if we can express it
as the boundary of a collection of 2-cells. Hence

$$\alpha = d_1 + \dots + d_l = \partial\Big(\sum_j \sigma_j\Big) = \sum_j \partial\sigma_j = \sum_j \partial[d_{j1}, d_{j2}, d_{j3}] = \sum_j (d_{j1} + d_{j2} + d_{j3}) \qquad (3.2)$$

And so

$$\phi(d_1) + \phi(d_2) + \dots + \phi(d_l) = \sum_j \big(\phi(d_{j1}) + \phi(d_{j2}) + \phi(d_{j3})\big) = 0$$

as each term in the brackets must add up to 0 by $(\star)$. □

Remark 3. The additions in (3.2) take place in the chain group of paths, that is
we only allow cancellations of the form $d + (-d) = 0$. Again since any induced
function satisfies $\phi(-d) = -\phi(d)$ for any $d$ we are safe.



Corollary 3.1. Let $A$ be a subset of $\mathbb{Z}_N$ such that $A - A = \mathbb{Z}_N$, and let $C_A$
be the cell-complex defined above. Suppose that $C_A$ has a trivial first homology
group. Then every Freiman homomorphism $f : A \to \mathbb{Z}_N$ is the restriction to $A$
of a linear function.

Proof. Any cycle in $C_A$ has trivial homology class. In particular, for any choice
of $d, d' \in \mathbb{Z}_N$ the cycle $(d, d', -d - d')$ has trivial homology class and so by
Proposition 3.2 we have that the induced function satisfies $\phi(d + d') = \phi(d) +
\phi(d')$. Hence $\phi$ is a homomorphism and the result follows. □
What is the principle behind this construction? Recall that our ultimate
aim is to understand what the space of possible induced functions $\phi$ looks like.
We know that, whenever a pair $(d_1, d_2)$ is additive, any such function must
satisfy $\phi(d_1 + d_2) = \phi(d_1) + \phi(d_2)$, so it is natural to turn our attention to this,
a priori, larger space of functions:

Let $\mathcal{F}$ be the space of all functions $\phi : \mathbb{Z}_N \to \mathbb{Z}_N$ such that $\phi(0) = 0$ and

$$\phi(d_1 + d_2) = \phi(d_1) + \phi(d_2)$$

whenever the pair $(d_1, d_2)$ is additive.

This is a submodule (over $\mathbb{Z}_N$) of the free module $M = \mathbb{Z}_N^{N-1}$ of all functions
$f : \mathbb{Z}_N \to \mathbb{Z}_N$ such that $f(0) = 0$. We may take as a basis the elements
$e_1, \dots, e_{N-1}$ where the $e_j := 1_{\{j\}}$ are the indicator functions taking the value 1
at $j$ and 0 otherwise.

Consider the subgroup

$$B := \langle e_{d_1} + e_{d_2} + e_{d_3} : [d_1, d_2, d_3] \in C_A \rangle$$

For $\phi, \psi \in M$ we may define a non-degenerate symmetric bilinear form,
analogous to an inner product, by

$$\langle \phi, \psi \rangle = \sum_{d \in \mathbb{Z}_N} \phi(d)\psi(d)$$

It is easy to see that $\phi \in \mathcal{F}$ if and only if $\langle \phi, f \rangle = 0$ for all $f \in B$ or, using the
vector space notation, $\mathcal{F} = B^{\perp}$. We will make use of the fact that

$$|M| = |B^{\perp}||B| \qquad (3.3)$$

On the other hand we have that $B$ is, by construction, isomorphic to the
group generated by the boundaries of the 2-cells of $C_A$. What is the group
generated by the cycles?

Let $G = \mathbb{Z}_N^{N-1}$ be the free module generated by the edge labels and consider
the homomorphism $\psi : G \to \mathbb{Z}_N$ such that $\psi(\bar{d}) = d$ for all $\bar{d} \in C_A$. The
map is clearly surjective and an element $x \in G$ is a cycle in $C_A$ if and only if
$x \in \ker\psi$. Hence, using the classification theorem of abelian groups, it follows
that $\ker\psi \cong \mathbb{Z}_N^{N-2}$.

Therefore the first homology group of the cell-complex $C_A$ is isomorphic to
$\ker\psi / B$.

This yields a more abstract proof of Proposition 3.2: if $C_A$ has trivial first
homology group then $B \cong \mathbb{Z}_N^{N-2}$. Now, since $|M| = |B^{\perp}||B|$ it follows that
$|B^{\perp}| = N$ and so $\mathcal{F} = B^{\perp} \cong \mathbb{Z}_N$, as we already know that $\mathcal{F}$ certainly contains
the subgroup of linear functions. Hence $\mathcal{F}$ is precisely the subgroup of linear
functions.



The advantage of this formulation is that it is now simple to show that the
converse of Proposition 3.2 also holds:

Proposition 3.3. Let $A \subseteq \mathbb{Z}_N$ and $C_A$ be as above. Suppose that $C_A$ has a
non-trivial first homology group. Then there exists a non-linear Freiman
homomorphism $f : A \to \mathbb{Z}_N$.

Proof. If $C_A$ has non-trivial first homology group then $B$ is a proper subgroup of
$\mathbb{Z}_N^{N-2}$. In particular $|B| < N^{N-2}$ and from (3.3) it follows that $|\mathcal{F}| = |B^{\perp}| > N$.
Hence, using Proposition 2.2, we can conclude that the number of Freiman
homomorphisms $f : A \to \mathbb{Z}_N$ with $f(0) = 0$ is strictly greater than $N$ and thus there
must exist one that is not linear. □
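The counting argument suggests a direct numerical test of linearity for prime $N$: compute the dimension of the space $\mathcal{F}$ and compare it with 1. The sketch below is an illustration under stated assumptions, not the paper's procedure. It encodes the defining relations of $\mathcal{F}$ (one per additive pair, including degenerate triples, plus $\phi(0) = 0$) rather than the homology of $C_A$; the helper names are hypothetical, and $A - A = \mathbb{Z}_N$ is assumed.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over Z_p (p prime), by Gaussian elimination."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(u - factor * v) % p for u, v in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def additive_pairs(A, N):
    """All pairs (d, d') such that some x, x + d, x + d + d' lie in A."""
    A = set(a % N for a in A)
    return {((y - x) % N, (z - y) % N) for x in A for y in A for z in A}

def dim_F(A, N):
    """Dimension over Z_N of {phi : phi(0) = 0, phi additive on additive pairs}."""
    rows = [[1] + [0] * (N - 1)]                   # phi(0) = 0
    for d1, d2 in additive_pairs(A, N):
        row = [0] * N                              # phi(d1 + d2) - phi(d1) - phi(d2) = 0
        row[(d1 + d2) % N] += 1
        row[d1] -= 1
        row[d2] -= 1
        rows.append(row)
    return N - rank_mod_p(rows, N)

print(dim_F(range(7), 7), dim_F([0, 1, 2, 3], 7), dim_F([0, 1, 3], 7))
```

A value of 1 means that only the functions $d \mapsto ad$ survive, so the set is linear; a larger value signals a non-linear Freiman homomorphism, as for $\{0, 1, 3\} \subset \mathbb{Z}_7$.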



Remark 4. If we consider homomorphisms $f : \mathbb{Z}_N \to \mathbb{Z}$ instead, the above
arguments carry through and give a stronger result: if the cell-complex $C_A$ has
trivial first homology group then the only Freiman homomorphisms are the
constant functions.

4 A family of surfaces 

The previous section has provided us with an accurate geometric description of 
the structural properties that are necessary and sufficient for a set to have only 
linear Freiman homomorphisms. 

This suggests the following strategy: 

1. Pick an orientable surface $\mathcal{H}$ together with a triangulation $\Delta(\mathcal{H})$ that has
a triangle $T$ as boundary.

2. Fix an oriented triangle $[a, b, c]$ in $C_A$.

3. Attempt to embed $\Delta(\mathcal{H})$ in $C_A$ in such a way that $T$ gets mapped to $[a, b, c]$
and any triangular face of $\Delta(\mathcal{H})$ gets mapped into some 2-cell of $C_A$.

If we can do so then we have explicitly shown that the homology class of the
oriented triangle $[a, b, c]$ is the trivial one and hence, by Corollary 3.2, the pair
$(b - a, c - b)$ is additive. The aim is to estimate carefully the probability
that this process succeeds for all choices of $[a, b, c]$.

Remark 5. We may assume henceforth that $a, b, c$ are distinct, as it follows
immediately from the definition and the fact that $A - A = \mathbb{Z}_N$ that pairs of the
form $(0, d)$ and $(-d, d)$ must be additive.

For instance if we set $\mathcal{H}$ to be a single triangular face then embedding $\mathcal{H}$ is
precisely demanding that the pair $(b - a, c - b)$ is additive. In other words, with
this particular choice of $\mathcal{H}$, the strategy above is the same as trying to show that
every pair is additive, which, as we have seen previously, is a sufficient condition
to guarantee that all induced functions $\phi$ are linear.

The hope is that, by choosing more complex $\mathcal{H}$, we will be able to improve
the range of values of $p$ for which this embedding strategy is successful with
high probability.

We will now consider in detail a particular sequence of simplicial complexes:

$\mathcal{H}_0 = [a, b, c]$

$\mathcal{H}_1 = [a, b, z][a, z, c][z, b, c]$

and in general $\mathcal{H}_{i+1}$ is obtained from $\mathcal{H}_i$ by taking each 2-simplex $[x_1, x_2, x_3] \in
\mathcal{H}_i$ and subdividing it into three new simplexes $[x_1, x_2, x][x_1, x, x_3][x, x_2, x_3]$.
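The subdivision rule is purely combinatorial and can be generated mechanically. The sketch below is illustrative; the function names and the vertex names `z1, z2, ...` for the new interior vertices are assumptions made for the example.

```python
from itertools import count

def subdivide(faces, fresh):
    """One refinement step: replace each face [x1, x2, x3] by
    [x1, x2, x], [x1, x, x3], [x, x2, x3] with a new interior vertex x."""
    out = []
    for (x1, x2, x3) in faces:
        x = next(fresh)
        out += [(x1, x2, x), (x1, x, x3), (x, x2, x3)]
    return out

def H(i):
    """Faces of the i-th complex in the sequence, starting from [a, b, c]."""
    fresh = (f"z{k}" for k in count(1))      # hypothetical names for new vertices
    faces = [("a", "b", "c")]
    for _ in range(i):
        faces = subdivide(faces, fresh)
    return faces

print(H(1))
print(len(H(2)), len(H(3)))
```

Each step triples the number of faces, so the $i$-th complex has $3^i$ faces and $(3^i - 1)/2$ interior vertices.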




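[Figure 1: Sketch of $\mathcal{H}_1$ and $\mathcal{H}_2$. $\mathcal{H}_1$ has faces $[a,b,z][a,z,c][z,b,c]$; in $\mathcal{H}_2$
these are subdivided into $[a,b,z'][a,z',z][z',b,z]$, $[a,z,z''][a,z'',c][z'',z,c]$ and
$[z,b,z'''][z,z''',c][z''',b,c]$.]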



We will show that the family $(\mathcal{H}_i)_{i \ge 1}$ is a suitable family of simplicial
complexes. That is to say that, for any $p = N^{-1/2+\epsilon}$ with $\epsilon > 0$, we can find some $i$
(depending only on $\epsilon$) such that with high probability, for all oriented triangles
$[a, b, c]$, we can embed $\mathcal{H}_i$ in $C_A$ with boundary $[a, b, c]$.

In order to do this it will be helpful to express the above statement in
analytic terms; it will allow us to bring in powerful tools of probability theory
concerning the concentration of random variables about their mean.

Let us consider the case when $i = 0$. The only way we can embed $\mathcal{H}_0$ in $C_A$
with boundary $[a, b, c]$ is if the oriented triangle is the boundary of a 2-cell in
$C_A$, i.e. if and only if there exists a triple of the form $(x, x + a, x + a + b) \in A^3$.
In terms of the characteristic function $1_A$ this statement is equivalent to

$$\sum_{x \in \mathbb{Z}_N} t_x t_{x+a} t_{x+a+b} > 0$$

where $t_1, \dots, t_N$ are Boolean variables with

$$t_x = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{otherwise.} \end{cases}$$

Once we have the first instance, it is simple to give a recursive expression for
the subdivisions.

Lemma 4.1. Define the family of polynomials

$$\Lambda^0_{a,b,c} = \sum_{x \in \mathbb{Z}_N} t_{x+a} t_{x+b} t_{x+c}$$

$$\Lambda^{i+1}_{a,b,c} = \sum_{z \in \mathbb{Z}_N} \Lambda^i_{a,b,z} \Lambda^i_{a,z,c} \Lambda^i_{z,b,c}$$

Then the complex $\mathcal{H}_i$ can be embedded in $C_A$ with boundary $[b - a, c - b, c - a]$
if and only if

$$\Lambda^i_{a,b,c}[t_1, \dots, t_N] > 0$$

Proof. The lemma follows from a simple induction argument. Clearly $\Lambda^0_{a,b,c} > 0$
if and only if there exists a triple $x + a, x + b, x + c \in A$, which is to say
$[b - a, c - b, c - a] \in C_A$. Assume the result holds for $i$ and notice that $\mathcal{H}_{i+1}$ can
be embedded in $C_A$ with boundary $[b - a, c - b, c - a]$ if and only if there exists
some $z \in \mathbb{Z}_N$ such that $\mathcal{H}_i$ can be embedded with boundary $[b - a, z - b, z - a]$,
$[z - a, c - z, c - a]$ and $[b - z, c - b, c - z]$. By the induction hypothesis this
occurs if and only if $\Lambda^i_{a,b,z} \Lambda^i_{a,z,c} \Lambda^i_{z,b,c} > 0$ for some $z \in \mathbb{Z}_N$, i.e. if and only if
$\Lambda^{i+1}_{a,b,c} > 0$.

□
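For a fixed set $A$ the polynomials of Lemma 4.1 can be evaluated literally from the recursion. The sketch below is an illustration for very small $N$ only (the cost grows quickly with $i$); `make_lambda` is a hypothetical helper, and the symbol $\Lambda$ follows the notation used for these polynomials above.

```python
from functools import lru_cache

def make_lambda(A, N):
    """Return lam(i, a, b, c), the value of the i-th polynomial at t = 1_A."""
    A = frozenset(a % N for a in A)

    @lru_cache(maxsize=None)
    def lam(i, a, b, c):
        if i == 0:
            # number of x with x + a, x + b, x + c all in A
            return sum(1 for x in range(N)
                       if (x + a) % N in A and (x + b) % N in A and (x + c) % N in A)
        # recursion: sum over the new interior vertex z
        return sum(lam(i - 1, a, b, z) * lam(i - 1, a, z, c) * lam(i - 1, z, b, c)
                   for z in range(N))

    return lam

lam = make_lambda([0, 1, 3], 7)
print(lam(0, 0, 1, 3), lam(0, 0, 1, 2), lam(1, 0, 1, 2))
```

A positive value certifies an embedding of the corresponding complex; a zero value at level $i$ says nothing about higher levels.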

Hence, by Corollary 3.1, the main result will be proven if we can show the
following:

Theorem 4.1. Let $A$ be a random subset of $\mathbb{Z}_N$ where each $x \in \mathbb{Z}_N$ is chosen
independently with probability $p = N^{-1/2+\epsilon}$ for any $\epsilon > 0$. Let $\Lambda^i_{a,b,c}$ with $(a, b, c)$
distinct be the family of polynomials defined above. Then there exists some
$i = i(\epsilon)$ such that:

$$\mathbb{P}(\Lambda^i_{a,b,c} > 0 \text{ for all } (a, b, c)) = 1 - o_N(1)$$

The following sections aim to prove Theorem 4.1. Before we start tackling 
the proof, however, we need to introduce a powerful tool developed by V. Vu 
concerning the concentration of multivariate Boolean polynomials. 
5 Boolean polynomials

Boolean polynomials are objects that arise very naturally in probabilistic
combinatorics as a method of counting 'small' structures. They often turn out to
be highly concentrated around their means.

The classical setting is that of random graphs $G(n, p)$ on the vertex set $[n]$,
where each edge $ij$ in the graph is chosen independently with probability $p$. In
this instance our Boolean variables are given by $t_{ij}$ for $i < j$, where $t_{ij} = 1$ if
the edge $ij$ is in the graph and 0 otherwise. If we are interested in counting
the number of copies of a given small graph $K$ we can look at its corresponding
Boolean polynomial; for example, the number of edges in $G$ (which we may think
of as the number of copies of the graph with 2 vertices and one edge) is given
by $Y = \sum_{i<j} t_{ij}$. In this very particular case we may use a well known result
of Chernoff which states:

Theorem 5.1 (Chernoff). Let $Y = \sum_{i=1}^{N} t_i$ where the $t_i$ are independent Bernoulli($p$)
random variables. Then for any $\lambda > 0$

$$\mathbb{P}\left(|Y - \mathbb{E}(Y)| > \sqrt{\lambda N}\right) \le 2e^{-\lambda/2}$$

Of course, one could readily apply this whenever the polynomial we are
considering has the property that all terms in the summand are independent,
but this is not usually the case. This bound was generalised by Azuma:

Theorem 5.2 (Azuma). Let $\mathbb{E}_j(Y) = \mathbb{E}(Y \mid t_1, \dots, t_j)$ and set $d_j(t_1, \dots, t_j) =
\mathbb{E}_j(Y) - \mathbb{E}_{j-1}(Y)$. Then for any $\lambda > 0$

$$\mathbb{P}\left(|Y - \mathbb{E}(Y)| > \sqrt{\lambda \sum_{i=1}^{N} \|d_i\|_\infty^2}\right) \le 2e^{-\lambda/2}$$

where $\|d_i\|_\infty$ is the maximum value of $d_i$ over all possible values of $t_1, \dots, t_i$.

In the case where the $\|d_i\|_\infty$ are very small in comparison to the expectation,
Azuma's bound is ideal for showing strong concentration. Unfortunately, there
are many examples where this fails.

Suppose we are interested in the number of triangles in $G$. That is, we
are interested in the value of the random variable $Y = \sum_{i<j<k} t_{ij} t_{jk} t_{ik}$. The
expectation of $Y$ is $\Theta(n^3p^3)$, and each edge lies in at most $n - 2$ triangles, so in
this case $\|d_i\|_\infty \le n - 2$ for all $i$; but more importantly, no matter which ordering
we choose for the variables $t_{ij}$, we have that $\|d_{\mathrm{last}}\|_\infty = (1 - p)(n - 2)$. Azuma's
bound will only yield useful information if $\mathbb{E}(Y) \gg n^2$, so if $p \gg n^{-1/3}$.
On the other hand it is a very rare event that a particular edge lies in $n - 2$
triangles: the expected number of such triangles is about $np^2$. Furthermore we
can use Chernoff to show that this number is strongly concentrated around its
mean. The intuition here is that one should not look at the maximum value of
$d_i$ but rather at its expectation to obtain concentration for smaller ranges of $p$.
This is essentially what Vu's result manages to achieve, but before we state the
theorem we need to introduce some terminology.



The first thing to point out is that since we are dealing with $\{0, 1\}$ valued
variables the polynomials $t_1^2 t_2 t_3^2$ and $t_1 t_2 t_3$ take identical values, so without loss
of generality we may assume that each variable has degree at most 1. To be
more precise, every Boolean polynomial $P$ has a unique reduced form as

$$P[t_1, \dots, t_N] = \sum_{B \in \mathcal{A}} w(B)\, t_B$$

where $\mathcal{A}$ is a family of subsets of $[N]$, $t_B = \prod_{i \in B} t_i$ and $w : \mathcal{A} \to \mathbb{R}$ is some
weight function. We say that $P$ is positive if $w > 0$.

Definition 5.1. Given $C \subseteq [N]$ and a Boolean polynomial $P = \sum_{B \in \mathcal{A}} w(B)\, t_B$ let

$$m(C, l; P) = \sum_{B \in \mathcal{A},\ C \subseteq B,\ |B| = l} w(B)$$

in other words $m(C, l; P)$ counts the (weighted) number of terms in $P$ of length
$l$ containing the term $t_C$. Furthermore let

$$m(C; P) = \sum_l m(C, l; P)$$

and note that $m(\emptyset; P)$ is by definition the total number of terms in $P$.



We would like to replace the $\|d_i\|_\infty$ in Azuma's bound by something weaker
such as an average. Now we define the quantities that will play this important
role.

Definition 5.2. Given any $C \subseteq [N]$ and $P = \sum_{B \in \mathcal{A}} w(B)\, t_B$, the partial
derivative of $P$ with respect to $C$ is given by the polynomial

$$\partial_C P = \sum_{B \in \mathcal{A},\ C \subseteq B} w(B)\, t_{B \setminus C}$$

and set

$$\mathbb{E}_j(P) = \max\{\mathbb{E}(\partial_C P) : C \subseteq [N],\ |C| \ge j\}$$

It is clear that if one knows the values of $m(C, l; P)$ for all $C, l$ then one may
easily compute $\mathbb{E}_j(P)$ for all $j$ as

$$\mathbb{E}(\partial_C P) = \sum_{B \in \mathcal{A},\ C \subseteq B} w(B)\, p^{|B| - |C|} = \sum_l m(C, l; P)\, p^{l - |C|}$$

In order to illustrate these definitions, let us go back for a moment to the
problem of counting triangles in the random graph $G(n, p)$. Recall that we are
looking at the polynomial $Y = \sum_{i<j<k} t_{ij} t_{jk} t_{ik}$ that counts the number of
triangles.
Set $C = \{12\}$, that is the single edge 12. Then

$$\partial_C Y = \sum_{k > 2} t_{1k} t_{2k}$$

so $m(C; Y) = n - 2$ and $\mathbb{E}(\partial_C Y) = p^2(n - 2)$. The former quantity is the
maximal number of triangles containing the edge 12 and the latter the expected
number. More generally, if we fix a single edge there are exactly $n - 2$ triangles
containing it and if we fix 2 edges there is (at most) 1 triangle containing both
edges, therefore

$\mathbb{E}(\partial_C Y) = (n - 2)p^2$ for any $C$ with $|C| = 1$

$\mathbb{E}(\partial_C Y) \le p$ for any $C$ with $|C| = 2$

$\mathbb{E}(\partial_C Y) \le 1$ for any $C$ with $|C| \ge 3$

Hence, provided that $p \gg n^{-1}$,

$$\mathbb{E}_0(Y) = \mathbb{E}(Y) = \Theta(n^3p^3), \quad \mathbb{E}_1(Y) = \max\{(n - 2)p^2, 1\}, \quad \mathbb{E}_2(Y) = 1 \qquad (5.1)$$
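The expected partial derivatives of the triangle polynomial can be checked by enumerating monomials. The sketch below is illustrative; the helper names are hypothetical, and monomials are represented as sets of edges.

```python
from itertools import combinations

def triangle_terms(n):
    """Monomials of Y = sum_{i<j<k} t_ij t_jk t_ik, each as a frozenset of edges."""
    return [frozenset({(i, j), (j, k), (i, k)})
            for i, j, k in combinations(range(1, n + 1), 3)]

def expected_derivative(C, terms, p):
    """E(d_C Y): sum over monomials B containing C of p^(|B| - |C|)."""
    C = frozenset(C)
    return sum(p ** (len(B) - len(C)) for B in terms if C <= B)

terms = triangle_terms(6)
p = 0.5
print(expected_derivative({(1, 2)}, terms, p))            # (n - 2) p^2
print(expected_derivative({(1, 2), (2, 3)}, terms, p))    # p
print(expected_derivative({(1, 2), (3, 4)}, terms, p))    # no common triangle
```

With $n = 6$ and $p = 1/2$ the first value is $(n - 2)p^2 = 1$, the second is $p$, and the third vanishes because two disjoint edges lie in no common triangle.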

We are now ready to state Vu's Theorem on the concentration of multivariate
Boolean polynomials:

Theorem 5.3 (V. Vu). Let $P$ be a positive reduced Boolean polynomial of degree
$k$. Then for any positive numbers $\mathcal{F}_0 > \mathcal{F}_1 > \dots > \mathcal{F}_k$ and $\lambda$ satisfying

• for all $0 \le j \le k$, $\mathcal{F}_j \ge \mathbb{E}_j(P)$

• for all $0 \le j < k$, $\mathcal{F}_j / \mathcal{F}_{j+1} \ge \lambda + 4j \log N$

there exist some constants $c_k$ and $d_k$ depending only on $k$ such that the following
holds:

$$\mathbb{P}\left(|P - \mathbb{E}(P)| \ge c_k \sqrt{\lambda \mathcal{F}_0 \mathcal{F}_1}\right) \le d_k e^{-\lambda/4}$$

where the $\mathbb{E}_j(P)$ are defined as above.

We will now apply Vu's Theorem to the triangle counting problem in $G(n, p)$.
If we let $p = \Theta(n^{-2/3})$ then, from (5.1), it follows that $\mathbb{E}_0(Y) = \Theta(n)$ and that
$\mathbb{E}_1(Y), \mathbb{E}_2(Y) \le 1$.

Now set $\mathcal{F}_0 = Cn$, $\mathcal{F}_1 = \sqrt{Cn}$ and $\mathcal{F}_2 = 1$ where $C$ is some sufficiently large
constant. It is easy to check that for $\lambda = a\sqrt{n}$, where $a$ is a constant chosen
so that $c_k \sqrt{\lambda \mathcal{F}_0 \mathcal{F}_1} \le \epsilon \mathbb{E}(Y)$, the $\mathcal{F}_j$ meet the requirements of the theorem
and therefore:

$$\mathbb{P}(|Y - \mathbb{E}(Y)| \ge \epsilon \mathbb{E}(Y)) \le C e^{-a\sqrt{n}/4},$$

which is a considerable improvement of the range of $p$ over the bound obtained
using Azuma's inequality.
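The "easy to check" step can be spelled out. The computation below is a sketch under the assumptions that $\lambda = a\sqrt{n}$ with $a < \sqrt{C}$, that the thresholds are $\mathcal{F}_0 = Cn$, $\mathcal{F}_1 = \sqrt{Cn}$, $\mathcal{F}_2 = 1$, and that the number of variables is $N = \binom{n}{2} \le n^2$, so that $\log N \le 2\log n$.

```latex
\begin{align*}
\mathcal{F}_0/\mathcal{F}_1 &= \sqrt{Cn} \;\ge\; a\sqrt{n} = \lambda
  && (j = 0),\\
\mathcal{F}_1/\mathcal{F}_2 &= \sqrt{Cn} \;\ge\; a\sqrt{n} + 8\log n \;\ge\; \lambda + 4\log N
  && (j = 1,\ n \text{ sufficiently large}),\\
\sqrt{\lambda\,\mathcal{F}_0\,\mathcal{F}_1} &= \sqrt{a\sqrt{n}\cdot Cn\cdot\sqrt{Cn}}
  = \sqrt{a}\,C^{3/4}\,n .
\end{align*}
```

Since $\mathbb{E}(Y) = \Theta(n)$ in this range, the deviation $c_k\sqrt{a}\,C^{3/4}\,n$ can be made smaller than $\epsilon\,\mathbb{E}(Y)$ by taking $a$ small enough.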
6 The base case

Before we tackle the proof of Theorem 4.1 we will first show a weaker result,
which will serve as a nice concrete example as well as paving the way for the
more general proof.

Theorem 6.1. Let $A$ be a random subset of $\mathbb{Z}_N$ where each $x \in \mathbb{Z}_N$ is chosen
independently with probability $p = N^{-4/9+\epsilon}$ for any $\epsilon > 0$. Then

$$\mathbb{P}(\Lambda^1_{a,b,c} > 0 \text{ for all } (a, b, c)) = 1 - o_N(1)$$

One easy observation is that without loss of generality we may assume that
$a, b, c$ are distinct, as otherwise if, say, $a = b$, we would be ultimately trying to
show that any induced function $\phi$ must satisfy $\phi(0 + d) = \phi(0) + \phi(d)$ for
$d = c - a$, which is trivially true provided that the function is defined at $d$.

We wish to apply Vu's Theorem to show that Theorem 6.1 holds. In order to do so
we must estimate accurately the values of all the different partial derivatives;
in this case the simplest approach is to start by giving an explicit expansion of
the polynomial $\Lambda^1_{a,b,c}$:

$$\Lambda^1_{a,b,c} = \sum_z \Lambda^0_{a,b,z} \Lambda^0_{a,z,c} \Lambda^0_{z,b,c}$$

$$= \sum_{x_1, x_2, x_3, z} (x_1 + a,\ x_1 + b,\ x_1 + z,\ x_2 + a,\ x_2 + z,\ x_2 + c,\ x_3 + z,\ x_3 + b,\ x_3 + c)$$

where here the 9-tuple $(v_1, \dots, v_9)$ is used to denote the term $t_{v_1} \cdots t_{v_9}$.

Let $B \subset [N]$; to compute the expected value of $\partial_B \Lambda^1_{a,b,c}$ one essentially needs
to count the number of occurrences of the set $B$ within each of the terms of $\Lambda^1_{a,b,c}$.

To this end we lay out the following set up:

1. We begin by identifying the monomials in $\Lambda^1$ as points in $\mathbb{Z}_N^9$, parametrized
by the quadruple $(x_1, x_2, x_3, z)$, via the linear map $\psi : \mathbb{Z}_N^4 \to \mathbb{Z}_N^9$:

$(x_1, x_2, x_3, z) \mapsto (x_1 + a,\ x_1 + b,\ x_1 + z,\ x_2 + a,\ x_2 + z,\ x_2 + c,\ x_3 + z,\ x_3 + b,\ x_3 + c)$

2. Pick a point $(x_1, x_2, x_3, z) \in \mathbb{Z}_N^4$ uniformly at random.

3. Ask what the probability is that $B \subseteq \psi(x_1, x_2, x_3, z) = (v_1, \dots, v_9)$, i.e. $B$
appears as a (not necessarily ordered) subsequence of $(v_1, \dots, v_9)$.

If we were in an ideal world, to be able to apply Vu's Theorem directly, we
would have that:
• $\mathbb{E}_0(\Lambda^1) = \Theta(N^4p^9)$

• $\mathbb{E}_1(\Lambda^1) = o(N^4p^9)$

Unfortunately a quick inspection shows that this is not quite the case: suppose
we choose to take $x_1 = x_2 = x_3$; then the 9-tuple above collapses into

$$P = \sum_{x, z} (x + a,\ x + b,\ x + c,\ x + z)$$

It is easy to see that $\mathbb{E}(P) = \Omega(N^2p^4)$, which is bad news as this is of a
greater order of magnitude than $N^4p^9$ as long as $p < N^{-2/5}$. In particular for
$p = N^{-4/9+\epsilon}$, provided $\epsilon$ is sufficiently small, $\mathbb{E}(P) > \mathbb{E}(\Lambda^1)$.

Furthermore if we take $B = \{a, b, c\}$ then

$$\mathbb{E}(\partial_B \Lambda^1) \ge \mathbb{E}(\partial_B P) \ge Np$$

which again for $p = N^{-4/9+\epsilon}$ is of a greater order of magnitude than $N^4p^9$, or
$N^2p^4$ for that matter.
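The comparison of orders of magnitude is a short exponent computation. The following worked check assumes $p = N^{-4/9+\epsilon}$, so that $N^4p^9 = N^{9\epsilon}$; the thresholds on $\epsilon$ are derived here for illustration and are not taken from the source.

```latex
\begin{align*}
\frac{N^2p^4}{N^4p^9} &= \frac{1}{N^2p^5} = N^{2/9 - 5\epsilon}
  && \text{(grows with } N \text{ when } \epsilon < 2/45\text{)},\\
\frac{Np}{N^4p^9} &= \frac{1}{N^3p^8} = N^{5/9 - 8\epsilon}
  && \text{(grows with } N \text{ when } \epsilon < 5/72\text{)}.
\end{align*}
```

So for small $\epsilon$ both the degenerate term and the partial derivative at $B = \{a, b, c\}$ dominate the intended main term $N^4p^9$.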

At this point it may seem as though there is little hope of being able to
deduce Theorem 6.1 by applying Vu's Theorem. However if we take a closer
look at $P$ we note that every monomial of $P$ contains a monomial of $\Lambda^0_{a,b,c}$
as a factor. In particular $P > 0$ implies $\Lambda^0_{a,b,c} > 0$ and
so the triangle $[a, b, c]$ was in fact already a simplex in the surface.

Remark 6. Informally speaking, what is happening here is that there are some
inherent degenerate terms in the definition of $\Lambda^1$, coming from some specific
quadruples $(x_1, x_2, x_3, z)$, which for a certain range of $p$ become bigger than the
main term. At first glance this might not seem problematic, since ultimately
we wish to show that $\Lambda^1_{a,b,c} > 0$ with very high probability (that is, with failure
probability decaying exponentially in $N$), so a priori having big, but nonetheless
positive, degenerate terms such as $P$ should make the task easier. The subtlety is
that despite having a big expectation $P$ will not be strongly concentrated around
its mean and in fact $\mathbb{P}(P > 0) \ll \mathbb{P}(\Lambda^1 > 0)$. For instance, in the above example
$\mathbb{E}(P)$ is large, yet $\mathbb{P}(P > 0) \le \mathbb{P}(\Lambda^0 > 0) \le \mathbb{E}(\Lambda^0) = Np^3$, which tends to zero
for $p = N^{-4/9+\epsilon}$ with $\epsilon$ sufficiently small.

The way to get around this issue is by restricting our attention to quadruples
$(x_1, x_2, x_3, z)$ such that all 9 linear forms take distinct values.

Definition 6.1. We will say that a quadruple $(x_1, x_2, x_3, z)$ is degenerate if it
satisfies any non-trivial relation $r(x_1, x_2, x_3, z, a, b, c)$ of length at most 4. That
is to say, we may find $y_1, y_2, y_3, y_4 \in \{x_1, x_2, x_3, z, a, b, c\}$ and $\epsilon_i \in \{-1, 0, 1\}$
such that

$$\epsilon_1 y_1 + \epsilon_2 y_2 + \epsilon_3 y_3 + \epsilon_4 y_4 = 0$$

Let $\mathcal{X}$ be the set of all non-degenerate quadruples and set

$$\tilde{\Lambda}^1 := \sum_{(x_1, x_2, x_3, z) \in \mathcal{X}} \psi(x_1, x_2, x_3, z)$$

Note that, for any non-trivial relation $r$, the number of quadruples that
satisfy it is at most $N^3$; furthermore the number of such relations is bounded
above by an absolute constant $C$ (here setting $C = 2^4 7^4$ will do). Therefore the
number of degenerate quadruples is at most $CN^3$ and the number of monomials
in $\tilde{\Lambda}^1$ is still of order $N^4$, all of length 9 by definition, as otherwise we would
obtain some relation $r(x_1, x_2, x_3, z, a, b, c)$ of length 4.

Now we can easily compute the expectation of $\tilde{\Lambda}^1$:

$$\mathbb{E}(\tilde{\Lambda}^1) = \sum_{q \in \mathcal{X}} \mathbb{E}(\psi(q)) = \Theta(N^4p^9)$$



Before we dive any further into computations we introduce a bit of notation:
for $B \subset [N]$ and $Q$ any Boolean polynomial let

$$P(B; Q) := \frac{m(B; Q)}{m(\emptyset; Q)}$$

that is the proportion of monomials in $Q$ containing the term $t_B$.

In particular, $P(B; \tilde{\Lambda}^1)$ is precisely the probability that $B$ is contained in
a monomial of $\tilde{\Lambda}^1$ when we pick a non-degenerate quadruple $(x_1, x_2, x_3, z)$
uniformly at random.

Proposition 6.1. Given any $B \subset [N]$ with $|B| \le 9$ we have that

$$P(B; \tilde{\Lambda}^1) \le \begin{cases} CN^{-|B|/2} & \text{if } |B| < 9 \\ CN^{-4} & \text{if } |B| = 9 \end{cases} \qquad (6.1)$$

where $C$ is some sufficiently large positive absolute constant.
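Proof. Recall that

\[ \tilde{\Lambda}^1_{a,b,c} \subseteq \sum_{z} \Lambda^0_{a,b,z}\, \Lambda^0_{a,z,c}\, \Lambda^0_{z,b,c} \tag{6.2} \]

where we write $P \subseteq Q$ to mean that every monomial in $P$ is also a monomial
in $Q$.

Hence:

\[ m(B; \tilde{\Lambda}^1_{a,b,c}) \le \sum_{z} \sum_{S_1 \cup S_2 \cup S_3 = B} m(S_1; \Lambda^0_{a,b,z})\, m(S_2; \Lambda^0_{a,z,c})\, m(S_3; \Lambda^0_{z,b,c}) \tag{6.3} \]

where we write $S_1 \cup S_2 \cup S_3 = B$ to mean that $S_1, S_2, S_3$ partition $B$. Furthermore,
since the number of terms in $\tilde{\Lambda}^1_{a,b,c}$ is at least $\frac{1}{2} N^4$ for sufficiently large $N$,
and each $\Lambda^0$ has $N$ monomials, we certainly have:

\[ P(B; \tilde{\Lambda}^1_{a,b,c}) \le \frac{2}{N} \sum_{z} \sum_{S_1 \cup S_2 \cup S_3 = B} P(S_1; \Lambda^0_{a,b,z})\, P(S_2; \Lambda^0_{a,z,c})\, P(S_3; \Lambda^0_{z,b,c}) \tag{6.4} \]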
Fix a partition $S_1 \cup S_2 \cup S_3 = B$ and note that the number of all such partitions
is bounded by a constant $C$ independent of $N$ (it depends only on $|B| \le 9$).

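We look at $P(S_1; \Lambda^0_{a,b,z})$ for a fixed $z \in \mathbb{Z}_N$. Each monomial in $\Lambda^0_{a,b,z}$ is given
by the triple $(x + a, x + b, x + z)$ where $x$ ranges over all values in $\mathbb{Z}_N$.

It follows that

\[ P(S_1; \Lambda^0_{a,b,z}) \le \begin{cases} C' N^{-\lceil |S_1|/2 \rceil} & \text{if } |S_1| < 3, \\ C' N^{-1} & \text{if } |S_1| = 3, \end{cases} \]

and similarly for $P(S_2; \Lambda^0_{a,z,c})$ and $P(S_3; \Lambda^0_{z,b,c})$.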

Remark 7. The alert reader will have noticed that the above is a rather convoluted
way of saying that $P(S_1; \Lambda^0_{a,b,z}) \le C N^{-1}$ provided $S_1 \ne \emptyset$. The reason
for presenting it in this way is to make the arguments analogous to those in the
general case.
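
The bound in Remark 7 can be checked numerically for a small modulus. The following is a minimal sketch, assuming (as stated above) that the monomials of $\Lambda^0_{a,b,z}$ are indexed by $x \in \mathbb{Z}_N$ and given by the triples $(x + a, x + b, x + z)$. The function name `proportion` and the sample values of `N`, `a`, `b`, `z` are illustrative choices, not taken from the paper. The reasoning it illustrates: any element of a non-empty $S$ must equal one of $x + a$, $x + b$, $x + z$, which leaves at most three choices of $x$, so the proportion is at most $3/N$.

```python
from itertools import combinations

def proportion(S, a, b, z, N):
    """Fraction of x in Z_N whose triple {x+a, x+b, x+z} contains every element of S."""
    S = set(S)
    hits = sum(1 for x in range(N)
               if S <= {(x + a) % N, (x + b) % N, (x + z) % N})
    return hits / N

# Illustrative parameters (assumptions, not from the paper).
N, a, b, z = 11, 0, 1, 4

# Largest proportion over all non-empty subsets S of size at most 3.
worst = max(proportion(S, a, b, z, N)
            for r in (1, 2, 3)
            for S in combinations(range(N), r))
print(worst, 3 / N)
```

For the empty set the proportion is 1, which is why the case $S_1 = \emptyset$ is excluded in the remark.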

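We split the sum in (6.4) into two parts and consider each case individually:

Case 1, $|S_j| < 3$ for all $j = 1, 2, 3$:

Using the above inequality on the expression (6.4), the contribution of these partitions is at most

\[ \frac{2}{N} \sum_{z} \sum_{S_1 \cup S_2 \cup S_3 = B} (C')^3 N^{-\lceil (|S_1| + |S_2| + |S_3|)/2 \rceil} \le 2 \sum_{S_1 \cup S_2 \cup S_3 = B} (C')^3 N^{-\lceil |B|/2 \rceil} \le 2 C (C')^3 N^{-\lceil |B|/2 \rceil} \]

where the last inequality comes from the fact that there are at most $C$ partitions
of $B$ as $S_1 \cup S_2 \cup S_3$.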

Case 2, $|S_j| = 3$ for some $j = 1, 2, 3$:

Without loss of generality assume that $|S_1| = 3$.

Claim. $P(S_1; \Lambda^0_{a,b,z}) = 0$ for all but at most 6 values of $z \in \mathbb{Z}_N$.

Let $S_1 = \{y_1, y_2, y_3\}$. Write $\Lambda^0_{a,b,z} = \sum_x (x + a, x + b, x + z)$ and suppose we
decide to put the assignment

\[ x + a = y_1, \qquad x + b = y_2, \qquad x + z = y_3. \]

Then $x = y_1 - a$ which implies $z = y_3 + a - y_1$. Thus for any given assignment
$z$ is fully determined and there is a total of $3! = 6$ possible assignments. The
claim follows.
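
The counting argument in the claim can be confirmed by brute force for a small prime. This is only a sketch under the same assumption as in the text, namely that the monomials of $\Lambda^0_{a,b,z}$ are the triples $(x + a, x + b, x + z)$; the helper name `admissible_z` and the values `N = 13`, `a = 0`, `b = 1` are hypothetical choices made for illustration.

```python
from itertools import combinations

def admissible_z(S, a, b, N):
    """Values of z for which some x in Z_N gives {x+a, x+b, x+z} == S."""
    S = set(S)
    return {z for z in range(N) for x in range(N)
            if {(x + a) % N, (x + b) % N, (x + z) % N} == S}

# Illustrative parameters (assumptions, not from the paper).
N, a, b = 13, 0, 1

# For every 3-element set S, count the values of z that can produce it.
counts = [len(admissible_z(S, a, b, N)) for S in combinations(range(N), 3)]
print(max(counts))
```

The bound of 6 comes from the $3!$ possible assignments; for particular $a, b$ the true maximum can be smaller, because $S$ must contain two elements differing by $b - a$.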

Hence from 6.4 again: 

SiU52US3=S 

Now if $|S_2| = |S_3| = 3$ then

SiUS2US3=-B 

which covers the case $|B| = 9$.

Otherwise without loss of generality we may assume that $|S_3| < 3$, which implies
$P(S_2; \Lambda^0_{a,z,c})\, P(S_3; \Lambda^0_{z,b,c}) \le (C')^2 N^{-\lceil |S_2|/2 \rceil - \lceil |S_3|/2 \rceil + 1}$. Thus

SiUS2US3=B 

as required. 

□ 

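Proof of Theorem 6.1. Once we have proven Proposition 6.1 it is an easy task
to compute $\mathbb{E}_j(\tilde{\Lambda}^1_{a,b,c})$:

\[ \mathbb{E}(\partial_B \tilde{\Lambda}^1_{a,b,c}) \le C N^4 P(B; \tilde{\Lambda}^1_{a,b,c})\, p^{9 - |B|} \le C N^{4 - \lceil |B|/2 \rceil} p^{9 - |B|} \]

and therefore, since $Np^2 \gg 1$:

\[ \mathbb{E}_j(\tilde{\Lambda}^1_{a,b,c}) \le C (Np^2)^{-\lceil j/2 \rceil} N^4 p^9 \quad (1 \le j \le 8), \qquad \mathbb{E}_9(\tilde{\Lambda}^1_{a,b,c}) \le C. \]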

We are now well placed to apply Vu's Theorem. Setting J^o = CN^p^, J-j+i = 
^-i/i8jr^. g^j^^ ^ _ ^\iQYe the constants are c and C are appropriately 

chosen the conditions of Vu's theorem are satisfied and 



< Ce 



P(|Ai,,_, - E (a1_,,,) I >eE (a1_,,,)) 
Moreover 

P(Ai f, ^ > for all (a, 6, c)) < 1 - N^Ce-^^'^'"'' ^ 1 - on{1) 

□ 

7 The general case 

Most of the arguments above can be extended to the proof of Theorem 4.1,
the main obstacle being that in the general case we cannot simply expand out
$\Lambda^i_{a,b,c}$ to see what it looks like and extract the properties we are interested in.

However we can get around this difficulty by exploiting the recursive nature of
this polynomial, which is very amenable to inductive arguments.



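Proposition 7.1. For each $i \ge 1$ there exists a linear map $\psi^i : \mathbb{Z}_N^{d+3} \to \mathbb{Z}_N^{2d+1}$
such that

\[ \Lambda^i_{a,b,c} = \sum_{v \in \mathbb{Z}_N^d} t_{\psi^i((a,b,c) \oplus v)} \]

where $2d + 1 = \deg(\Lambda^i_{a,b,c}) = 3^{i+1}$.

Furthermore we may choose $\psi^{i+1}$ in such a way that for $v_1, v_2, v_3 \in \mathbb{Z}_N^d$ and
$z \in \mathbb{Z}_N$

\[ \psi^{i+1} : (a, b, c) \oplus v_1 \oplus v_2 \oplus v_3 \oplus z \mapsto \psi^i_{a,b,z}(v_1) \oplus \psi^i_{a,z,c}(v_2) \oplus \psi^i_{z,b,c}(v_3). \tag{7.1} \]

For the sake of convenience we shall use $\psi^i_{a,b,c}(v)$ to denote $\psi^i((a, b, c) \oplus v)$.

Proof. We argue by induction on $i$. We have already seen in the previous section
that this holds in the case $i = 1$ so assume that the result holds for $i$.
By the induction hypothesis:

\begin{align*}
\Lambda^{i+1}_{a,b,c} &= \sum_{z} \Lambda^i_{a,b,z}\, \Lambda^i_{a,z,c}\, \Lambda^i_{z,b,c} \\
&= \sum_{z} \Big( \sum_{v_1 \in \mathbb{Z}_N^d} t_{\psi^i_{a,b,z}(v_1)} \Big) \Big( \sum_{v_2 \in \mathbb{Z}_N^d} t_{\psi^i_{a,z,c}(v_2)} \Big) \Big( \sum_{v_3 \in \mathbb{Z}_N^d} t_{\psi^i_{z,b,c}(v_3)} \Big) \\
&= \sum_{z} \sum_{v_1, v_2, v_3 \in \mathbb{Z}_N^d} t_{\psi^i_{a,b,z}(v_1)}\, t_{\psi^i_{a,z,c}(v_2)}\, t_{\psi^i_{z,b,c}(v_3)}.
\end{align*}

Thus taking $\psi^{i+1}_{a,b,c}$ as in equation (7.1) will do. All that remains is to show
that it is linear:

\[ \psi^{i+1}_{a+a', b+b', c+c'}(v_1 + v_1' \oplus v_2 + v_2' \oplus v_3 + v_3' \oplus z + z') = \psi^i_{a+a', b+b', z+z'}(v_1 + v_1') \oplus \psi^i_{a+a', z+z', c+c'}(v_2 + v_2') \oplus \psi^i_{z+z', b+b', c+c'}(v_3 + v_3'). \]

Now looking at each component individually,

\[ \psi^i_{a+a', b+b', z+z'}(v_1 + v_1') = \psi^i_{a,b,z}(v_1) + \psi^i_{a',b',z'}(v_1') \]

by the induction hypothesis and similarly for $v_2, v_3$. Hence the above is the
same as

\[ \psi^{i+1}_{a,b,c}(v_1 \oplus v_2 \oplus v_3 \oplus z) + \psi^{i+1}_{a',b',c'}(v_1' \oplus v_2' \oplus v_3' \oplus z') \]

and hence $\psi^{i+1}$ is linear.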

□ 
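
The recursion (7.1) is easy to experiment with numerically. The sketch below is an illustration only: it assumes a base map $\psi^0_{a,b,c}(x) = (x + a, x + b, x + c)$, read off from the description of the monomials of $\Lambda^0$ in the previous section, and it assumes the input is ordered as $v_1 \oplus v_2 \oplus v_3 \oplus z$ as in (7.1). The names `psi`, `dim`, `evaluate` and the modulus `N = 101` are hypothetical. It checks that the output has $3^{i+1} = 2d + 1$ coordinates and that the map is additive on one random pair of inputs.

```python
import random

def psi(i, a, b, c, v, N):
    """Evaluate psi^i_{a,b,c}(v) in Z_N; psi^0_{a,b,c}(x) = (x+a, x+b, x+c) is an assumption."""
    if i == 0:
        (x,) = v
        return [(x + a) % N, (x + b) % N, (x + c) % N]
    d = (len(v) - 1) // 3                      # length of each block v1, v2, v3
    v1, v2, v3, z = v[:d], v[d:2 * d], v[2 * d:3 * d], v[3 * d]
    return (psi(i - 1, a, b, z, v1, N)
            + psi(i - 1, a, z, c, v2, N)
            + psi(i - 1, z, b, c, v3, N))

def dim(i):
    """Number of free variables d at level i: d = 1 at level 0, then d -> 3d + 1."""
    return 1 if i == 0 else 3 * dim(i - 1) + 1

def evaluate(i, u, N):
    """Apply psi^i to a vector u = (a, b, c) followed by v."""
    return psi(i, u[0], u[1], u[2], u[3:], N)

N, i = 101, 2                                   # illustrative values
rng = random.Random(0)
u = [rng.randrange(N) for _ in range(3 + dim(i))]
w = [rng.randrange(N) for _ in range(3 + dim(i))]
s = [(p + q) % N for p, q in zip(u, w)]

lhs = evaluate(i, s, N)                                         # psi(u + w)
rhs = [(p + q) % N for p, q in zip(evaluate(i, u, N), evaluate(i, w, N))]
print(len(lhs), lhs == rhs)
```

A single random pair is of course not a proof of linearity; it only makes the block structure of (7.1) concrete.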

As in the previous section, we cannot hope to show that the polynomial
$\Lambda^i_{a,b,c}$ is strongly concentrated around its expectation because of the presence of
'noise terms'. We need to introduce an analogous notion of degeneracy that gets
rid of these terms whilst keeping the expectation of the right order. Before we
do so however we need to prove one more property of $\psi^i_{a,b,c}$.

Proposition 7.2. Let us write $\psi^i : \mathbb{Z}_N^{d+3} \to \mathbb{Z}_N^{2d+1}$ as $(L_1, L_2, \dots, L_{2d+1})$ where
$L_j : \mathbb{Z}_N^{d+3} \to \mathbb{Z}_N$ are linear forms. Then for $v = (a, b, c) \oplus (v_1, v_2, \dots, v_d)$

\[ L_j(v) = x_j + y_j \]

for $x_j \in \{a, b, c, v_1, \dots, v_d\}$ and $y_j \in \{v_1, \dots, v_d\}$. Furthermore $x_j \ne y_j$ and
$L_j \ne L_{j'}$ for $j \ne j'$.

Proof. Unsurprisingly we argue by induction on $i$. Set

\begin{align*}
(L_1, L_2, \dots, L_{2d+1}) &= \psi^i((a, b, z) \oplus v_1), \\
(L'_1, L'_2, \dots, L'_{2d+1}) &= \psi^i((a, z, c) \oplus v_2), \\
(L''_1, L''_2, \dots, L''_{2d+1}) &= \psi^i((z, b, c) \oplus v_3).
\end{align*}

Then

\[ \psi^{i+1}_{a,b,c}(v_1 \oplus v_2 \oplus v_3 \oplus z) = (L_1, \dots, L_{2d+1}) \oplus (L'_1, \dots, L'_{2d+1}) \oplus (L''_1, \dots, L''_{2d+1}) \]

thus, since each of the linear maps in each component has the desired form by
induction, we are done.

Also by induction the $L_j$, $1 \le j \le 2d + 1$, are pairwise distinct,
and similarly for the $L'_j$ and $L''_j$. The only possible equalities are therefore of
the form $L_j = L'_{j'}$ for instance, but this can only happen if $L_j$ is one of $a + b$,
$b + c$ or $a + c$ and this is not possible by the induction hypothesis.

□ 
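
The shape of the linear forms can also be inspected symbolically. The following sketch tracks each coordinate as the pair of symbols it sums, again assuming the recursion (7.1) and a hypothetical base case in which level 0 has a single free variable $x$ and forms $x + a$, $x + b$, $x + c$. The function `forms` and its symbol-naming scheme (`tag`) are illustrative inventions, not notation from the paper.

```python
def forms(i, a, b, c, tag="v"):
    """Coordinates of psi^i_{a,b,c}, each recorded as the set of two symbols it sums."""
    if i == 0:
        x = tag + ".x"                      # the single free variable at level 0
        return [frozenset({x, a}), frozenset({x, b}), frozenset({x, c})]
    z = tag + ".z"                          # the new free variable at this level
    return (forms(i - 1, a, b, z, tag + ".1")
            + forms(i - 1, a, z, c, tag + ".2")
            + forms(i - 1, z, b, c, tag + ".3"))

params = {"a", "b", "c"}
F = forms(2, "a", "b", "c")

two_symbols = all(len(f) == 2 for f in F)   # every form is a sum of two different symbols
has_variable = all(f - params for f in F)   # at least one summand is a free variable
distinct = len(set(F)) == len(F)            # no two forms coincide
print(len(F), two_symbols, has_variable, distinct)
```

Forms from different blocks cannot coincide here because each form involves a free variable private to its own block, which matches the role of the induction hypothesis in the proof.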

We can now define our notion of degeneracy: 



Definition 7.1. A $d$-tuple $v = (v_1, v_2, \dots, v_d)$ is degenerate if it satisfies any
non-trivial relation $r(v_1, v_2, \dots, v_d, a, b, c)$ of length at most 4.

Let $\mathcal{H}^i_{a,b,c} \subset \mathbb{Z}_N^d$ be the set of all non-degenerate $d$-tuples and set

\[ \tilde{\Lambda}^i_{a,b,c} := \sum_{v \in \mathcal{H}^i_{a,b,c}} t_{\psi^i_{a,b,c}(v)}. \]

The number of such relations is bounded by an absolute constant $C \le 8(d + 3)^4$
independent of $N$ and the number of $d$-tuples satisfying a fixed relation $r$ is
$O(N^{d-1})$. Therefore $|\mathcal{H}^i_{a,b,c}| = \Omega(N^d)$. Furthermore it follows from Proposition 7.2
that for all $v \in \mathcal{H}^i_{a,b,c}$, all $2d + 1$ coordinates of $\psi^i_{a,b,c}(v)$ are distinct. Hence

\[ \mathbb{E}(\tilde{\Lambda}^i_{a,b,c}) = \Theta(N^d p^{2d+1}). \]

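Proposition 7.3. Given any $B \subset [N]$ with $|B| \le 2d + 1$ we have that

\[ P(B; \tilde{\Lambda}^i_{a,b,c}) \le \begin{cases} C N^{-\lceil |B|/2 \rceil} & \text{if } |B| < 2d + 1, \\ C N^{-d} & \text{if } |B| = 2d + 1, \end{cases} \]

where we recall that $P(B; Q)$ denotes the proportion of monomials in $Q$ containing the term $t_B$.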

Proof. We will argue by induction on $i$. The case $i = 1$ was handled in the
previous section so assume that the proposition holds for $i$. Note that if
$v = v_1 \oplus v_2 \oplus v_3 \oplus z$ is in $\mathcal{H}^{i+1}_{a,b,c}$ then we must have that $v_1 \in \mathcal{H}^i_{a,b,z}$, $v_2 \in \mathcal{H}^i_{a,z,c}$
and $v_3 \in \mathcal{H}^i_{z,b,c}$, i.e.

\[ \mathcal{H}^{i+1}_{a,b,c} \subseteq \mathcal{H}^i_{a,b,z} \oplus \mathcal{H}^i_{a,z,c} \oplus \mathcal{H}^i_{z,b,c}. \tag{7.2} \]

Therefore

\[ \tilde{\Lambda}^{i+1}_{a,b,c} \subseteq \sum_{z} \tilde{\Lambda}^i_{a,b,z}\, \tilde{\Lambda}^i_{a,z,c}\, \tilde{\Lambda}^i_{z,b,c}. \tag{7.3} \]

Now fix a partition $S_1 \cup S_2 \cup S_3 = B$. Again the number of such partitions is
bounded by a constant $C$ independent of $N$, depending only on $d$. Then clearly we have that

Z 

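and thus

\[ P(B; \tilde{\Lambda}^{i+1}_{a,b,c}) \le \frac{C}{N} \sum_{z} P(S_1; \tilde{\Lambda}^i_{a,b,z})\, P(S_2; \tilde{\Lambda}^i_{a,z,c})\, P(S_3; \tilde{\Lambda}^i_{z,b,c}). \tag{7.4} \]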

We need to consider three cases: 

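Case 1, $|S_j| < k = 2d + 1$ for all $j = 1, 2, 3$.

By the induction hypothesis $P(S_j; \tilde{\Lambda}^i) \le C N^{-\lceil |S_j|/2 \rceil}$ and hence by (7.4)

\[ P(B; \tilde{\Lambda}^{i+1}_{a,b,c}) \le C N^{-\lceil |S_1|/2 \rceil} N^{-\lceil |S_2|/2 \rceil} N^{-\lceil |S_3|/2 \rceil} \le C N^{-\lceil |B|/2 \rceil} \tag{7.5} \]

as desired.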
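
The last inequality in Case 1 rests on the elementary fact that the ceiling function is subadditive, so that $\lceil |S_1|/2 \rceil + \lceil |S_2|/2 \rceil + \lceil |S_3|/2 \rceil \ge \lceil |B|/2 \rceil$ whenever $|S_1| + |S_2| + |S_3| = |B|$. The short check below is only a sanity test over a small range; the helper name `half_up` and the range bound `k = 9` are arbitrary illustrative choices.

```python
from itertools import product

def half_up(s):
    """Return ceil(s / 2) for a non-negative integer s, using integer arithmetic."""
    return (s + 1) // 2

k = 9  # illustrative upper limit for the part sizes tested
ok = all(
    half_up(s1) + half_up(s2) + half_up(s3) >= half_up(s1 + s2 + s3)
    for s1, s2, s3 in product(range(k), repeat=3)
)
print(ok)
```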

Case 2, $|B| < 2d + 1$ and $|S_j| = k$ for some $j = 1, 2, 3$.
Without loss of generality we may assume that $|S_1| = k$.

Claim. $P(S_1; \tilde{\Lambda}^i_{a,b,z}) = 0$ for all but at most $k!$ values of $z$.

Proof. It can be shown by induction that the map $v \oplus z \mapsto \psi^i_{a,b,z}(v)$ is injective.
Let $S_1 = \{s_1, s_2, \dots, s_k\}$; then for each permutation $\pi : [k] \to [k]$, the equation

\[ \psi^i_{a,b,z}(v) = (s_{\pi(1)}, s_{\pi(2)}, \dots, s_{\pi(k)}) \]

has at most 1 solution, in particular at most one possible value of $z$. There is a
total of $k!$ permutations and hence the claim follows. □

By the induction hypothesis $P(S_1; \tilde{\Lambda}^i_{a,b,z}) \le C N^{-d}$ so looking back at equation
(7.4) we have:

\[ P(B; \tilde{\Lambda}^{i+1}_{a,b,c}) \le C N^{-d-1} P(S_2; \tilde{\Lambda}^i_{a,z,c})\, P(S_3; \tilde{\Lambda}^i_{z,b,c}). \]

On the other hand, since one of $|S_2|, |S_3|$ is smaller than $k$ we certainly have the
inequality $P(S_2; \tilde{\Lambda}^i_{a,z,c})\, P(S_3; \tilde{\Lambda}^i_{z,b,c}) \le C N^{-\lceil (|S_2| + |S_3|)/2 \rceil + 1}$. Substituting back
in the above:



^ + i r iS2l + |S3l 1 

' fc + ISal + ISai l r[B|-| 

rial 



<CN \ ^ I = CN- 



Case 3, $|S_1| = |S_2| = |S_3| = k$. Then $|B| = 3k = 6d + 3$ and the above yields
which completes the proof. 



□ 



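We can now easily estimate $\mathbb{E}_j(\tilde{\Lambda}^i_{a,b,c})$:

Proposition 7.4. Suppose that $Np^2 \gg 1$; then for $j = 1, 2, \dots, 2d$ there exists
an absolute constant $C$ depending only on $d$ such that

\[ \mathbb{E}_j(\tilde{\Lambda}^i_{a,b,c}) \le C (Np^2)^{-\lceil j/2 \rceil} N^d p^{2d+1} \tag{7.6} \]

and $\mathbb{E}_{2d+1}(\tilde{\Lambda}^i_{a,b,c}) \le C$.

Proof. Let $B \subset [N]$ such that $|B| = j < 2d + 1$. By Proposition 7.3,

\[ \mathbb{E}(\partial_B \tilde{\Lambda}^i_{a,b,c}) \le C P(B; \tilde{\Lambda}^i_{a,b,c})\, N^d p^{2d+1-|B|} \le C N^{-\lceil |B|/2 \rceil} N^d p^{2d+1-|B|}. \]

If $|B| = 2d + 1$ then $\mathbb{E}(\partial_B \tilde{\Lambda}^i_{a,b,c}) \le C P(B; \tilde{\Lambda}^i_{a,b,c})\, N^d \le C N^{-d} N^d = C$. □

Since the above inequality holds for any $B \subset [N]$ and $Np^2 \gg 1$, equation (7.6)
follows.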

We are now ready to prove our main result: 

Proof of Theorem 4.1. First we choose i sufficiently large so that 3*^^^ = 2(i+l = 
k>^. Then E (a*) = N'^p^^+i ^ ^'^pk ^ > 

Also note that Np'^ = A^^ and thus by Proposition 7.4 it follows that E,- (AM < 
Ck for J > 1 for some absolute constant depending only on k. 
Now we apply Vu's Theorem (5.3), setting A — iV^ and 

Clearly Tj > Ej ^A*^ and, provided iV is large enough, Tj/J-j+i ~ N"^/^ > 
A + 4j log N. Hence by Vu's Theorem, 

P(|A^_,_, - E (^Ka,) I > CfcVA^o-Fi) < dfce-^ (7.7) 

where again Ck and dk are constants dependent only on k. Now Cky/XJ^o-^i = 
CfeCfciV'(i-i/'*'=) < Ar«^(i-i/8fc) and dfeC-^ < e-^'^^' again provided is suffi- 
ciently large. 

Thus, 

provided A^ is large enough. In particular it follows that P(A' 
and therefore: 

T{Kac > for all (a, b, c)) >1 - N^e-^''"" 